Stochastic Distributional Models for Textual Information Retrieval

Authors

  • Martin Rajman
  • Romaric Besançon
Abstract

The objective of this paper is to present a textual similarity model for Information Retrieval (IR) based on the Distributional Semantics (DS) model. This model extends the standard Vector Space model by additionally taking into account the co-frequencies between terms in a given reference corpus, which are considered to provide a distributional representation of the "semantics" of the terms. Practical retrieval experiments using DS-based similarity models have been conducted in the framework of the AMARYLLIS evaluation campaign. The results obtained are presented and indicate a significant improvement in performance compared with the standard approach.

Keywords: Textual similarity, Information Retrieval, Distributional Semantics.
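
The idea of enriching a Vector Space representation with co-frequencies can be illustrated with a small sketch. It is not the authors' implementation: the abstract gives no formulas, so the sketch assumes that co-frequency is counted at document level in the reference corpus, that a text's distributional profile is the frequency-weighted sum of the co-frequency rows of its terms, and that similarity is the cosine. All function names and the toy corpus are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cofrequencies(corpus, vocab):
    """Count, for each pair of vocabulary terms, how many reference
    documents contain both (assumption: document-level co-occurrence)."""
    index = {term: i for i, term in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in vocab]
    for doc in corpus:
        present = sorted({index[t] for t in doc if t in index})
        for i in present:
            matrix[i][i] += 1  # a term trivially co-occurs with itself
        for i, j in combinations(present, 2):
            matrix[i][j] += 1
            matrix[j][i] += 1
    return matrix


def vs_vector(tokens, vocab):
    """Standard Vector Space representation: raw term frequencies."""
    counts = Counter(tokens)
    return [counts[term] for term in vocab]


def ds_vector(tokens, vocab, cofreq):
    """Distributional profile (assumed form): frequency-weighted sum
    of the co-frequency rows of the terms occurring in the text."""
    freq = vs_vector(tokens, vocab)
    size = len(vocab)
    return [sum(freq[i] * cofreq[i][j] for i in range(size)) for j in range(size)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy reference corpus (hypothetical data, for illustration only).
corpus = [
    ["car", "engine", "road"],
    ["automobile", "engine", "road"],
    ["bank", "money"],
]
vocab = ["automobile", "bank", "car", "engine", "money", "road"]
cofreq = cofrequencies(corpus, vocab)

query, document = ["car"], ["automobile"]
vs_sim = cosine(vs_vector(query, vocab), vs_vector(document, vocab))
ds_sim = cosine(ds_vector(query, vocab, cofreq), ds_vector(document, vocab, cofreq))
```

In this toy setting the query and the document share no term, so the plain Vector Space cosine (`vs_sim`) is zero, while the distributional profiles overlap through the shared contexts "engine" and "road", so `ds_sim` is positive.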

Similar resources

Textual Similarities Based on a Distributional Approach

The design of efficient textual similarities is an important issue in the domain of textual data exploration. Textual similarities are for example central in document collection structuring (e.g. clustering), or in Information Retrieval (IR) which relies on the computation of textual similarities for measuring the adequacy between a query and documents. The objective of this paper is to present...

Care episode retrieval: distributional semantic models for information retrieval in the clinical domain

Patients' health-related information is stored in electronic health records (EHRs) by health service providers. These records include sequential documentation of care episodes in the form of clinical notes. EHRs are used throughout the health care sector by professionals, administrators and patients, primarily for clinical purposes, but also for secondary purposes such as decision support and r...

Arabic Textual Entailment with Word Embeddings

Determining the textual entailment between texts is important in many NLP tasks, such as summarization, question answering, and information extraction and retrieval. Various methods have been suggested based on external knowledge sources; however, such resources are not always available in all languages and their acquisition is typically laborious and very costly. Distributional word representa...

An Architecture for Scientific Document Retrieval: Using Textual and Math Entailment Modules

We present an architecture for scientific document retrieval. An existing system for textual and math-aware retrieval, Math Indexer and Searcher (MIaS), is designed for extension by modules for textual and math-aware entailment. The goal is to increase the quality of retrieval (precision and recall) by handling natural language variations in expressing semantically the same content in texts and/or formulae. Ent...

Distributional Semantics in Technicolor

Our research aims at building computational models of word meaning that are perceptually grounded. Using computer vision techniques, we build visual and multimodal distributional models and compare them to standard textual models. Our results show that, while visual models with state-of-the-art computer vision techniques perform worse than textual models in general tasks (accounting for semanti...

Publication date: 2004